Feature: PhotoBuilder — AI image generation for content pages #91

manuelkiessling · 2026-02-10T10:57:07Z

Closes #90.

PhotoBuilder — AI Image Generation for Content Pages

Summary

New vertical src/PhotoBuilder/ that generates AI-driven images matching the visual tone and content of a web page. Users launch PhotoBuilder from the Content Editor, receive AI-generated image prompts based on the page HTML, review/edit prompts, generate images, upload them to the S3 media store, and embed them back into the page — all within one integrated workflow.

Architecture

Vertical slice in src/PhotoBuilder/ following the established Domain / Infrastructure / Presentation layering, communicating with other verticals exclusively via facades.

graph LR
    PhotoBuilder -->|"readWorkspaceFile (dist HTML)"| WorkspaceMgmt
    PhotoBuilder -->|"getProjectInfo (LLM config, S3)"| ProjectMgmt
    PhotoBuilder -->|"uploadAsset (S3)"| RemoteContentAssets
    PhotoBuilder -->|"findAvailableFileNames (manifest polling)"| RemoteContentAssets
    PhotoBuilder -->|"getAccountInfoByEmail"| Account
    ChatBasedContentEditor -.->|"CTA link in dist files"| PhotoBuilder

Facade dependencies (documented in docs/vertical-wiring.md):

PhotoBuilder → WorkspaceMgmt: readWorkspaceFile (page HTML for prompt context)
PhotoBuilder → ProjectMgmt: getProjectInfo (LLM API keys, S3 credentials, provider config)
PhotoBuilder → RemoteContentAssets: uploadAsset, findAvailableFileNames (S3 upload + CDN manifest polling)
PhotoBuilder → Account: getAccountInfoByEmail (access validation)

Vertical Structure

src/PhotoBuilder/
├── Domain/
│   ├── Dto/
│   │   └── ImagePromptResultDto.php
│   ├── Entity/
│   │   ├── PhotoSession.php
│   │   └── PhotoImage.php
│   ├── Enum/
│   │   ├── PhotoSessionStatus.php
│   │   └── PhotoImageStatus.php
│   └── Service/
│       └── PhotoBuilderService.php
├── Infrastructure/
│   ├── Adapter/
│   │   ├── PromptGeneratorInterface.php
│   │   ├── OpenAiPromptGenerator.php
│   │   ├── ImageGeneratorInterface.php
│   │   ├── ImageGeneratorFactory.php
│   │   ├── OpenAiImageGenerator.php
│   │   ├── GeminiImageGenerator.php
│   │   ├── ImagePromptAgent.php
│   │   └── PatchedGemini.php
│   ├── Handler/
│   │   ├── GenerateImagePromptsHandler.php
│   │   └── GenerateImageHandler.php
│   ├── Message/
│   │   ├── GenerateImagePromptsMessage.php
│   │   └── GenerateImageMessage.php
│   └── Storage/
│       └── GeneratedImageStorage.php
├── Presentation/
│   ├── Controller/
│   │   └── PhotoBuilderController.php
│   └── Resources/
│       ├── assets/controllers/
│       │   ├── photo_builder_controller.ts
│       │   └── photo_image_controller.ts
│       └── templates/
│           └── photo_builder.twig
└── TestHarness/
    ├── FakeImageGenerator.php
    └── FakePromptGenerator.php

Domain Layer

Entities

PhotoSession — tracks one photo generation session per page:

id (UUID), workspaceId, conversationId, pagePath
systemPrompt, userPrompt (LLM prompt context)
status (enum: generating_prompts, prompts_ready, generating_images, images_ready, failed)
createdAt

PhotoImage — tracks each generated image:

id (UUID), session (ManyToOne → PhotoSession), position
prompt, suggestedFileName (LLM-generated)
status (enum: pending, generating, completed, failed)
storagePath (relative path in var/photo-builder/)
uploadedToMediaStoreAt, uploadedFileName (S3 upload tracking)
errorMessage

Service

PhotoBuilderService orchestrates session lifecycle: creates sessions with IMAGE_COUNT empty image slots, updates prompts from LLM output, coordinates status transitions, and respects "keep" flags during prompt regeneration.

Infrastructure Layer

Multi-Provider LLM Support (OpenAI + Google Gemini)

The plan originally assumed OpenAI only. The implementation introduces a two-tier, multi-provider LLM configuration:

Content Editing (OpenAI-only): existing chat-based content editor
PhotoBuilder (OpenAI or Google Gemini): configurable per project, with fallback to content editing settings

Prompt generation uses a NeuronAI Agent with a deliver_image_prompt tool — the LLM calls this tool once per image, delivering structured {prompt, file_name} pairs. This tool-based approach avoids fragile JSON parsing. The agent supports both OpenAI and Gemini providers, parameterized via ImagePromptAgent.

Image generation uses direct HTTP calls:

OpenAiImageGenerator: OpenAI Images API (gpt-image-1; image data is returned base64-encoded in b64_json, and the file type is set via output_format rather than response_format)
GeminiImageGenerator: Google Gemini API with native image generation (supports lo-res 1024px and hi-res 2048px modes)
ImageGeneratorFactory: selects the appropriate generator based on project configuration
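
A minimal sketch of the request and response shapes an OpenAI image adapter deals with, written in TypeScript for illustration only (the actual adapter is PHP). The helper names `buildGptImageRequest` and `extractBase64Image` are hypothetical, and the fixed PNG format and 1024x1024 size are assumptions taken from the storage path and the resolution note elsewhere in this PR.

```typescript
// Request body for the OpenAI Images API when using gpt-image-1.
// gpt-image-1 takes output_format (file type) instead of response_format.
interface OpenAiImageRequest {
  model: string;
  prompt: string;
  size: string;
  output_format: "png" | "jpeg" | "webp";
}

// Only the response field this sketch reads.
interface OpenAiImageResponse {
  data: Array<{ b64_json?: string }>;
}

// Hypothetical helper: builds the JSON body for one image.
function buildGptImageRequest(prompt: string): OpenAiImageRequest {
  return { model: "gpt-image-1", prompt, size: "1024x1024", output_format: "png" };
}

// Hypothetical helper: pulls the base64 payload out of the response.
function extractBase64Image(response: OpenAiImageResponse): string {
  const payload = response.data[0]?.b64_json;
  if (!payload) {
    throw new Error("Image response did not contain b64_json data");
  }
  return payload;
}
```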

PatchedGemini provider: NeuronAI's built-in Gemini provider only checks parts[0] for function calls, but Gemini 3 models return text/thought parts before function call parts. PatchedGemini scans all parts and reindexes correctly.
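
The scan-and-reindex idea can be sketched as follows. This is a TypeScript illustration of the logic, not the PHP provider itself; `GeminiPart` is reduced to the fields needed here, and `extractToolCalls` is a hypothetical name.

```typescript
// A Gemini response part, reduced to the fields this sketch needs.
interface GeminiPart {
  text?: string;
  thought?: boolean;
  functionCall?: { name: string; args: Record<string, unknown> };
}

interface ExtractedToolCall {
  index: number; // position among tool calls only, not among all parts
  name: string;
  args: Record<string, unknown>;
}

// Scan every part instead of only parts[0], skipping text/thought parts,
// and number the surviving function calls consecutively from zero.
function extractToolCalls(parts: GeminiPart[]): ExtractedToolCall[] {
  const calls: ExtractedToolCall[] = [];
  for (const part of parts) {
    if (part.functionCall === undefined) {
      continue;
    }
    calls.push({
      index: calls.length,
      name: part.functionCall.name,
      args: part.functionCall.args,
    });
  }
  return calls;
}
```

With a response whose first part is a thought and whose second part is a function call, a `parts[0]`-only check finds nothing, while this scan returns one call with index 0.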

Async Processing (Symfony Messenger)

Two message/handler pairs, dispatched through the immediate transport:

GenerateImagePromptsHandler: loads session, reads page HTML via facade, runs prompt agent, updates images with prompts + filenames, dispatches individual image generation messages (respects "keep" flags)
GenerateImageHandler: generates a single image via the selected provider, saves to disk, updates entity, checks if all session images are done
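
The "all session images are done" check could look roughly like this. It is a TypeScript sketch of the idea rather than the PHP handler, and the rule that a session with no completed image ends up as failed is an assumption, not something stated in this PR.

```typescript
type PhotoImageStatus = "pending" | "generating" | "completed" | "failed";
type SessionImagePhase = "generating_images" | "images_ready" | "failed";

// Hypothetical helper: derive the session status from its image statuses.
// Assumption: the session is done once every image is completed or failed,
// and it counts as failed only if no image completed at all.
function deriveSessionStatus(imageStatuses: PhotoImageStatus[]): SessionImagePhase {
  const allDone = imageStatuses.every((s) => s === "completed" || s === "failed");
  if (!allDone) {
    return "generating_images";
  }
  return imageStatuses.some((s) => s === "completed") ? "images_ready" : "failed";
}
```

Because each image is handled by its own message, a check like this has to run after every image update; whichever handler finishes last flips the session status.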

Messenger consumer scaled to 5 replicas (docker-compose.yml) for parallel image processing.

Image Storage

GeneratedImageStorage: filesystem adapter at var/photo-builder/{sessionId}/{position}.png with save/read/getAbsolutePath methods.

Presentation Layer

Controller

PhotoBuilderController with routes for:

Page rendering (GET /photo-builder/{workspaceId})
Session management (create, poll status, regenerate prompts)
Image operations (regenerate single image, update prompt, serve file, upload to S3)
Manifest availability check (polls CDN manifests before redirect)

Access control via #[IsGranted('ROLE_USER')] with workspace/project ownership verification.

Frontend (Stimulus + Twig)

Two-controller architecture:

photo_builder_controller.ts — page orchestrator managing session lifecycle, polling, global state, and inter-controller coordination
photo_image_controller.ts — per-card controller for individual image state, prompt editing, and UI feedback

Key UX features implemented beyond the original plan:

Lo-res/hi-res resolution toggle (Google Gemini only) — fast iteration vs. higher quality
Upload feedback — per-image spinner + "Uploaded" checkmark on single-image upload, overlay + "Uploading images, please wait..." on bulk embed
Manifest availability polling — after S3 upload, polls CDN manifests (3s intervals, 90s timeout) before redirecting to content editor, preventing broken image references
User prompt preservation — local edits in the "Additional image style instructions" textarea are not overwritten by poll responses
Cache-busting — timestamp suffix on image URLs after regeneration to defeat browser cache
Prompt language — generated prompts match the page's locale
"Keep prompt" during regeneration — users can protect individual prompts from being overwritten when regenerating all prompts
Parent-to-child event dispatch — DOM events bubble upward only; parent dispatches events directly on child elements (pattern documented in docs/frontendbook.md)
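
The manifest availability polling described in the list above can be sketched as a small loop. The `check` and `sleep` callbacks are hypothetical injection points (not names from this PR) that keep the loop testable without timers; elapsed time is approximated by counting sleeps, which ignores the duration of the check itself.

```typescript
type AvailabilityCheck = () => Promise<boolean>;
type Sleep = (ms: number) => Promise<void>;

// Polls until check() reports availability or the timeout budget is spent.
// Defaults mirror the 3s interval and 90s timeout named in this PR.
async function pollUntilAvailable(
  check: AvailabilityCheck,
  sleep: Sleep,
  intervalMs = 3000,
  timeoutMs = 90000,
): Promise<boolean> {
  let waitedMs = 0;
  while (true) {
    if (await check()) {
      return true;
    }
    if (waitedMs >= timeoutMs) {
      return false;
    }
    await sleep(intervalMs);
    waitedMs += intervalMs;
  }
}
```

In a browser, `sleep` could be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, and `check` would call the manifest availability endpoint and return its `allAvailable` flag.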

Template

photo_builder.twig — responsive image grid with loading overlay, user prompt section, per-image cards (preview, prompt textarea, keep checkbox, regenerate/upload buttons), and embed CTA. Uses etfswui-* styleguide classes throughout.

Content Editor Integration

PhotoBuilder CTA in dist_files_controller.ts: camera icon next to each page file that navigates to PhotoBuilder
Prefilled chat message: after embedding, PhotoBuilder navigates back to the content editor with a prefill query parameter (e.g. ?prefill=Embed images a.jpg, b.jpg into page x.html), which pre-fills the instruction textarea
Chat-based content editor: reads prefillMessage from URL and populates the input
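
The prefill round trip can be sketched with the standard `URL` API. The function names and the example host are hypothetical, and the parameter name `prefill` is taken from the example above; the real controllers may name things differently.

```typescript
// Hypothetical helper: build the content editor URL carrying the embed instruction.
function buildPrefillUrl(editorUrl: string, fileNames: string[], pagePath: string): string {
  const url = new URL(editorUrl);
  // searchParams.set handles the encoding of spaces and commas.
  url.searchParams.set("prefill", `Embed images ${fileNames.join(", ")} into page ${pagePath}`);
  return url.toString();
}

// Hypothetical helper: read the instruction back on the editor side.
function readPrefill(currentUrl: string): string | null {
  return new URL(currentUrl).searchParams.get("prefill");
}

const exampleEditorUrl = buildPrefillUrl(
  "https://example.test/en/editor",
  ["a.jpg", "b.jpg"],
  "x.html",
);
console.log(readPrefill(exampleEditorUrl));
```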

Project Settings: Hierarchical LLM Configuration

The existing single llmApiKey/llmModelProvider fields were renamed to contentEditingLlmApiKey/contentEditingLlmModelProvider (scoped to content editing). New optional photoBuilderLlm* fields were added with automatic fallback to content editing settings.
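
The fallback rule can be sketched as below. This is a TypeScript illustration, not the PHP entity or facade; the two `photoBuilderLlm*` field names are assumed from the naming pattern above, and `resolvePhotoBuilderLlm` is a hypothetical helper.

```typescript
type LlmModelProvider = "openai" | "google";

interface ProjectLlmSettings {
  contentEditingLlmApiKey: string;
  contentEditingLlmModelProvider: LlmModelProvider;
  // Assumed names for the optional PhotoBuilder-specific fields.
  photoBuilderLlmApiKey: string | null;
  photoBuilderLlmModelProvider: LlmModelProvider | null;
}

interface EffectiveLlm {
  provider: LlmModelProvider;
  apiKey: string;
}

// Use the dedicated PhotoBuilder settings when both are present,
// otherwise fall back to the content editing settings.
function resolvePhotoBuilderLlm(settings: ProjectLlmSettings): EffectiveLlm {
  if (settings.photoBuilderLlmApiKey !== null && settings.photoBuilderLlmModelProvider !== null) {
    return {
      provider: settings.photoBuilderLlmModelProvider,
      apiKey: settings.photoBuilderLlmApiKey,
    };
  }
  return {
    provider: settings.contentEditingLlmModelProvider,
    apiKey: settings.contentEditingLlmApiKey,
  };
}
```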

Project settings UI (project_form.twig) extended with:

Option A: "Use same settings as Content Editing" (default, one-click)
Option B: "Use dedicated provider" with provider radio (OpenAI/Google), API key input, and verification button
LLM key verification controller updated to resolve provider from the nearest fieldset

Model selection:

OpenAI: gpt-image-1 for image generation
Google: gemini-3-pro-image-preview for image generation, gemini-3-flash-preview for prompt generation

TestHarness

src/PhotoBuilder/TestHarness/ provides fake adapters for local development:

FakePromptGenerator: returns canned prompts without calling an LLM
FakeImageGenerator: generates placeholder images without API calls
Toggled via .env flags: PHOTO_BUILDER_SIMULATE_IMAGE_PROMPT_GENERATION, PHOTO_BUILDER_SIMULATE_IMAGE_GENERATION

Database Migrations

4 migrations:

Version20260210112717 — create photo_sessions and photo_images tables
Version20260211081136 — add uploaded_to_media_store_at to photo_images
Version20260211082223 — add uploaded_file_name to photo_images
Version20260211110000 — rename LLM fields to scoped names, add PhotoBuilder-specific LLM columns

Cross-Cutting Concerns

CSRF protection: token generated in Twig, passed to Stimulus as value, validated on all POST endpoints
Access control: #[IsGranted('ROLE_USER')] + workspace/project ownership verification
DateAndTimeService: used for all entity timestamps (no new DateTimeImmutable())
LLM wire logging: prompt agent supports wire logger for debuggability
Translations: full EN + DE coverage for all PhotoBuilder UI strings
Language switcher: preserves page and conversationId query params when switching locale

Documentation

docs/vertical-wiring.md updated with PhotoBuilder facade dependencies
docs/frontendbook.md updated with parent-to-child event dispatch pattern
docs/llm-usage-book.md added — documents all LLM concerns and provider configuration

Test Coverage

PHP unit tests (in tests/Unit/PhotoBuilder/):

PhotoSession, PhotoImage entities
PhotoBuilderService
GeneratedImageStorage
OpenAiImageGenerator, GeminiImageGenerator
PatchedGemini provider
RemoteContentAssetsFacade (findAvailableFileNames)

Frontend tests (Vitest, in tests/frontend/unit/PhotoBuilder/):

photo_builder_controller.test.ts — session lifecycle, polling, prompt regeneration, upload flow, manifest polling, resolution toggle
photo_image_controller.test.ts — state updates, prompt editing, keep checkbox, button state management, cache-busting
Plus additional tests in dist_files_controller.test.ts (PhotoBuilder CTA) and chat_based_content_editor_controller.test.ts (prefill message)

Stats: 78 files changed, ~8,900 lines added, ~340 lines removed.

Made with Cursor

… unit tests

New vertical for AI image generation matching web page content:

- Domain: PhotoSession/PhotoImage entities, enums, PhotoBuilderService with IMAGE_COUNT constant
- Infrastructure: PromptGenerator (NeuronAI agent with deliver_image_prompt tool), ImageGenerator (OpenAI Images API), GeneratedImageStorage, Messenger messages/handlers
- Tests: 45 unit tests covering entities, service logic, storage, and image generator

Co-authored-by: Cursor <cursoragent@cursor.com>

…translations

- PhotoBuilderController with all API endpoints (create session, poll, regenerate, serve image, upload to media store)
- Twig template with loading state, user prompt, responsive image grid, media store sidebar
- Two Stimulus controllers: photo_builder_controller.ts (orchestrator) and photo_image_controller.ts (per-card state management)
- EN+DE translations for all PhotoBuilder UI strings
- ImagePromptResultDto to replace associative arrays at boundaries
- Registered new controllers in bootstrap.ts and asset_mapper.yaml
- Service wiring in services.yaml, Twig namespace in twig.yaml
- All quality checks pass (PHPStan, ESLint, tsc, Prettier, PHP CS Fixer)

Co-authored-by: Cursor <cursoragent@cursor.com>

…s for PhotoBuilder

- Wire PhotoBuilder CTA (camera icon) into dist_files_controller for each page
- Add prefillMessage support to chat-based-content-editor controller for the "Embed generated images into content page" flow
- Register PhotoBuilder entities in doctrine.yaml and generate migration for photo_sessions and photo_images tables
- Add Vitest tests for photo_builder_controller (23 tests) and photo_image_controller (25 tests)
- Add tests for PhotoBuilder CTA in dist_files_controller (5 tests) and prefillMessage in chat_based_content_editor_controller (3 tests)

Co-authored-by: Cursor <cursoragent@cursor.com>

…ms, image serving

- Replace invalid placeholder strings (___SESSION_ID___) in Twig template with dummy UUIDs that satisfy Symfony route parameter requirements
- Use output_format instead of response_format for gpt-image-1 API (response_format is a dall-e-2/dall-e-3 parameter)
- Generate image URLs via Symfony router to include locale prefix, fixing broken image display due to missing /{_locale}/ in path
- Update vertical-wiring.md with PhotoBuilder facade dependencies
- Update corresponding unit and frontend tests

Co-authored-by: Cursor <cursoragent@cursor.com>

…dback, TestHarness

- Use etfswui-* styleguide classes on PhotoBuilder page (buttons, cards, forms)
- Add cursor-pointer to all CTAs via styleguide button classes
- Extract Remote Assets sidebar to @common.presentation/_remote_asset_browser_sidebar.html.twig
- Include shared partial in chat_based_content_editor and photo_builder
- Show 'Upload has been finished' banner on PhotoBuilder when upload completes (auto-hide 5s)
- Add PhotoBuilder TestHarness: FakePromptGenerator, FakeImageGenerator, env toggles
- PHOTO_BUILDER_SIMULATE_IMAGE_PROMPT_GENERATION and PHOTO_BUILDER_SIMULATE_IMAGE_GENERATION in .env
- Fix OpenAI image API (output_format for gpt-image-1), poll image URLs, Stimulus action wiring
- IMAGE_COUNT=1 for faster testing; docs/frontendbook.md and vertical-wiring.md updates

Co-authored-by: Cursor <cursoragent@cursor.com>

…er query params

- Show 'Upload has been finished' banner when image-card upload succeeds (not only sidebar)
- Regenerate prompts: overlay + spinner, clear unprotected prompt textareas on start
- Hide overlay when poll returns generating state; add regenerating_prompts translation
- Language switcher: preserve query string (page, conversationId) when switching locale on photo builder

Co-authored-by: Cursor <cursoragent@cursor.com>

- Add uploadedToMediaStoreAt to PhotoImage to track S3 uploads
- Persist upload state in uploadToMediaStore endpoint; idempotent when already uploaded
- Include uploadedToMediaStore in poll response
- Change embedIntoPage to async: upload non-uploaded images first, show 'Uploading images, please wait...' overlay, then navigate on success
- Add translations for uploading_images (EN/DE)
- Reset uploadedToMediaStoreAt when image is regenerated

Co-authored-by: Cursor <cursoragent@cursor.com>

…prompt

- Add uploadedFileName to PhotoImage for hash-prefixed S3 names in embed message
- Pass keepImageIds from regenerate prompts to handler; skip regenerating kept images
- Only dispatch image generation for changed prompts, not kept ones
- Clear uploaded state when prompt is regenerated

Co-authored-by: Cursor <cursoragent@cursor.com>
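
The keep-flag handling can be sketched as follows. This is a TypeScript illustration of the rule, not the PHP handler; `applyRegeneratedPrompts` is a hypothetical name, and matching new prompts to images by position is an assumption.

```typescript
interface ImageSlot {
  id: string;
  prompt: string;
  uploadedFileName: string | null;
}

// Applies regenerated prompts to all images that are not kept.
// Returns the ids of images whose prompt changed and that therefore
// need a new image generation message.
function applyRegeneratedPrompts(
  images: ImageSlot[],
  newPrompts: string[],
  keepImageIds: string[],
): string[] {
  const keep = new Set(keepImageIds);
  const toGenerate: string[] = [];
  images.forEach((image, position) => {
    const next = newPrompts[position];
    if (keep.has(image.id) || next === undefined || next === image.prompt) {
      return; // kept or unchanged: leave prompt, image, and upload state alone
    }
    image.prompt = next;
    image.uploadedFileName = null; // uploaded state is cleared on regeneration
    toGenerate.push(image.id);
  });
  return toGenerate;
}
```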

Dispatch clearPromptIfNotKept event on each child card element instead of the parent — DOM events bubble upward, so dispatching on the parent never reached child controllers. Also show "Generating..." text with pulse animation immediately on non-kept prompts, disable buttons during regeneration, and document the parent-to-child event pattern in frontendbook. Co-authored-by: Cursor <cursoragent@cursor.com>

…ider configuration

Introduce a two-tier LLM configuration system: content editing (OpenAI-only) and PhotoBuilder (OpenAI or Google Gemini). Projects can either reuse content editing settings for image generation or configure a dedicated provider/key.

- Rename llmApiKey/llmModelProvider to contentEditing* scope across entity, DTOs, facades, controllers, templates, and tests
- Add nullable photoBuilder* LLM fields with fallback to content editing
- Extend LlmModelProvider enum with Google case and model selection methods
- Extend LlmModelName enum with gpt-image-1, gemini-3-pro-preview, gemini-3-pro-image-preview
- Implement GeminiImageGenerator adapter and ImageGeneratorFactory
- Parameterize ImagePromptAgent to support both OpenAI and Gemini providers
- Add Google API key verification via Gemini models endpoint
- Add PhotoBuilder LLM settings UI (Option A: reuse / Option B: dedicated) with provider selection, key input, verification, and one-click reuse
- Display active provider and model names on PhotoBuilder page
- Add docs/llm-usage-book.md documenting all LLM concerns and configuration

Co-authored-by: Cursor <cursoragent@cursor.com>

The Stimulus controller searched for the provider radio only within its own element, missing sibling radios in the same fieldset. This caused Google Gemini keys to be verified against OpenAI, always failing. Widen the lookup scope to the closest fieldset/form ancestor. Co-authored-by: Cursor <cursoragent@cursor.com>

…nly)

Lo-res mode (1K, default) enables faster iteration; hi-res mode (2K) produces higher quality output. The toggle is only shown when the effective PhotoBuilder provider is Google Gemini, since OpenAI always generates 1024x1024. Switching modes re-generates all images client-side using current prompts at the new resolution without a page reload. Co-authored-by: Cursor <cursoragent@cursor.com>

Remove fixed container_name to allow scaling, add deploy.replicas: 5. Co-authored-by: Cursor <cursoragent@cursor.com>

After uploading images to S3 via the "Embed into page" action, poll the remote asset manifests until all uploaded filenames are confirmed available before redirecting. This prevents the content editor from referencing images that haven't propagated to the CDN yet.

- Add findAvailableFileNames() to RemoteContentAssetsFacade (basename matching against merged manifests) so the logic stays in the RemoteContentAssets vertical
- Add thin POST endpoint in PhotoBuilderController that delegates to the facade and returns { available, allAvailable }
- Frontend polls every 3s for up to 90s, showing a spinner overlay
- Includes PHP unit tests, frontend tests, and EN/DE translations

Co-authored-by: Cursor <cursoragent@cursor.com>
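
Basename matching against merged manifests can be sketched like this. It is a TypeScript illustration rather than the PHP facade; treating each manifest as a flat list of URL or path strings is an assumption about the manifest format.

```typescript
// Last path segment of a URL or path, e.g. "x/y/a.png" -> "a.png".
function baseName(pathOrUrl: string): string {
  return pathOrUrl.substring(pathOrUrl.lastIndexOf("/") + 1);
}

interface AvailabilityResult {
  available: string[];
  allAvailable: boolean;
}

// Merge all manifest entries into one set of basenames, then report which
// of the requested file names already appear in it.
function findAvailableFileNames(fileNames: string[], manifests: string[][]): AvailabilityResult {
  const known = new Set<string>();
  for (const manifest of manifests) {
    for (const entry of manifest) {
      known.add(baseName(entry));
    }
  }
  const available = fileNames.filter((fileName) => known.has(baseName(fileName)));
  return { available, allAvailable: available.length === fileNames.length };
}
```

The returned object has the same `{ available, allAvailable }` shape as the endpoint response, so the frontend can stop polling as soon as `allAvailable` is true.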

Use the faster and cheaper Flash model for generating image prompts in PhotoBuilder when the Google provider is selected. Pro remains the main text model for content editing. Co-authored-by: Cursor <cursoragent@cursor.com>

NeuronAI's Gemini provider only checks parts[0] for functionCall, but Gemini 3 models now return text/thought parts before functionCall parts, causing all tool calls to be silently missed (0 prompts). Introduce PatchedGemini provider that scans all parts and reindexes the tools array after filtering out non-functionCall parts. Co-authored-by: Cursor <cursoragent@cursor.com>

…from PhotoBuilder

- Add in-progress and success feedback near single-image Upload CTA (spinner + 'Uploading…', then checkmark + 'Uploaded'; dispatch uploadComplete/uploadFailed to card)
- Fix upload/success spans always visible: use wrapper spans so only 'hidden' is toggled (no inline-flex vs hidden conflict)
- Find card from event target for reliable completion/failure delivery
- Add translations: uploading_to_media_store, uploaded_to_media_store
- Stack Regenerate and Upload buttons vertically (flex-col) to fit space

Co-authored-by: Cursor <cursoragent@cursor.com>

…ons textarea

- Track lastAppliedUserPrompt; only apply server userPrompt from poll when current value matches it (or first load) so local edits are not overwritten
- Add unit test: user edit preserved when poll runs with textarea unfocused

Co-authored-by: Cursor <cursoragent@cursor.com>
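
The preservation rule can be sketched as a pure function. `lastAppliedUserPrompt` is the name used in the commit; the state shape and `applyPolledUserPrompt` are hypothetical and stand in for the Stimulus controller's textarea handling.

```typescript
interface UserPromptState {
  currentValue: string; // what the textarea currently shows
  lastAppliedUserPrompt: string | null; // null until a poll has been applied
}

// Apply the server value only on first load or when the user has not
// edited the textarea since the last applied server value.
function applyPolledUserPrompt(state: UserPromptState, serverPrompt: string): UserPromptState {
  const firstLoad = state.lastAppliedUserPrompt === null;
  const untouched = state.currentValue === state.lastAppliedUserPrompt;
  if (firstLoad || untouched) {
    return { currentValue: serverPrompt, lastAppliedUserPrompt: serverPrompt };
  }
  return state; // local edit wins
}
```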

The change-detection optimization skipped dispatching stateChanged to children when per-image data was unchanged, but children rely on that event to re-read the parent's data-photo-builder-generating attribute and enable/disable their Regenerate and Upload buttons. Now tracks anyGenerating transitions and force-dispatches to all cards when it changes. Co-authored-by: Cursor <cursoragent@cursor.com>
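
A sketch of that dispatch decision, reduced to a pure function. The snapshot shape and the name `cardsToNotify` are hypothetical; the real controller dispatches `stateChanged` events on the card elements instead of returning ids.

```typescript
interface CardSnapshot {
  id: string;
  imageStatus: string;
  prompt: string;
}

// Returns the ids of cards that should receive a stateChanged event.
// Cards with changed data are always notified; when the global
// "any image generating" flag flips, every card is notified so it can
// re-evaluate its button state.
function cardsToNotify(
  previous: Map<string, CardSnapshot>,
  current: CardSnapshot[],
  previousAnyGenerating: boolean,
  currentAnyGenerating: boolean,
): string[] {
  const forceAll = previousAnyGenerating !== currentAnyGenerating;
  return current
    .filter((card) => {
      if (forceAll) {
        return true;
      }
      const before = previous.get(card.id);
      return (
        before === undefined ||
        before.imageStatus !== card.imageStatus ||
        before.prompt !== card.prompt
      );
    })
    .map((card) => card.id);
}
```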

…uster

The cache-buster is only applied when img.src is actually set (after a regeneration cycle clears lastSetImageUrl), so repeated polls with the same URL still skip re-assignment and avoid redundant fetches. Co-authored-by: Cursor <cursoragent@cursor.com>
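
A minimal sketch of that behavior. `lastSetImageUrl` is the name used in the commit; the helper names and the `t` query parameter are assumptions for illustration.

```typescript
interface ImageUrlState {
  lastSetImageUrl: string | null;
}

// Returns the src to assign to the <img>, or null when the element
// should be left untouched because the polled URL was already applied.
function nextImageSrc(state: ImageUrlState, polledUrl: string, nowMs: number): string | null {
  if (state.lastSetImageUrl === polledUrl) {
    return null; // same URL as last time: skip re-assignment, no refetch
  }
  state.lastSetImageUrl = polledUrl;
  const separator = polledUrl.includes("?") ? "&" : "?";
  return `${polledUrl}${separator}t=${nowMs}`;
}

// Called when a regeneration cycle starts, so the next poll reassigns src.
function markRegenerated(state: ImageUrlState): void {
  state.lastSetImageUrl = null;
}
```

Storing the polled URL without the timestamp is what lets later polls compare equal and skip the assignment.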

…poll

The promptAwaitingRegenerate logic only accepted new prompts when status was "pending" or "generating". If image generation completed within one poll cycle, status was already "completed" and the condition never matched, leaving the textarea permanently stuck. Now compares the incoming prompt against the saved pre-regeneration prompt instead of checking status, correctly handling both fast completions and stale old-data polls. Co-authored-by: Cursor <cursoragent@cursor.com>
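
The comparison-based rule can be sketched as follows. `promptAwaitingRegenerate` comes from the commit message; `promptBeforeRegenerate` and `acceptPolledPrompt` are hypothetical names.

```typescript
interface RegenerateState {
  promptAwaitingRegenerate: boolean;
  promptBeforeRegenerate: string | null; // prompt saved when regeneration started
}

// Decides whether a polled prompt may be written into the textarea.
// While waiting, a prompt equal to the saved one is treated as stale data;
// any different prompt is accepted regardless of the image status.
function acceptPolledPrompt(state: RegenerateState, incomingPrompt: string): boolean {
  if (!state.promptAwaitingRegenerate) {
    return true;
  }
  if (incomingPrompt === state.promptBeforeRegenerate) {
    return false; // poll still carries the old prompt
  }
  state.promptAwaitingRegenerate = false;
  state.promptBeforeRegenerate = null;
  return true;
}
```

One consequence of this sketch: if the LLM happens to return a prompt identical to the old one, the card keeps waiting. Whether the real controller handles that case separately is not stated here.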

- Redesign Preview Pages and PhotoBuilder CTA as distinct styleguide cards
- Add translatable Edit HTML / Preview labels with proper icons
- Make filenames clickable links to preview URLs
- Move AI model info into Behind the Scenes section
- Translate embed prefill message for German locale
- Right-align Content Editor action buttons, use styleguide classes
- Use etfswui-card-back-link for PhotoBuilder back link
- Remove unused flex-row sidebar layout so content uses full width

Co-authored-by: Cursor <cursoragent@cursor.com>

manuelkiessling marked this pull request as draft February 10, 2026 10:57

manuelkiessling self-assigned this Feb 10, 2026

manuelkiessling added the enhancement New feature or request label Feb 10, 2026

manuelkiessling and others added 26 commits February 10, 2026 12:19

wip

6353558

prompt lang matches page language

5490fd6

Cleanups

efa44d8

Scale messenger consumer to 5 replicas for parallel message processing

8e00446

Switch Google image prompt generation to gemini-3-flash-preview

108a36d

Fixed remote asset browser sidebar template location; removed widget …

8f4721e

…from PhotoBuilder

WIP: Cleanups

1310430

manuelkiessling marked this pull request as ready for review February 11, 2026 18:41

manuelkiessling merged commit a80f17b into main Feb 11, 2026
5 checks passed
